Serializing C intermediate representations for efficient and portable parsing ( preprint ) Jeffrey

نویسندگان

  • Jeffrey A. Meister
  • Jeffrey S. Foster
  • Michael Hicks
چکیده

C static analysis tools often use intermediate representations (IRs) that organize program data in a simple, well-structured manner. However, the C parsers that create IRs are slow, and because they are difficult to write, only a few implementations exist, limiting the languages in which a C static analysis can be written. To solve these problems, we investigate two language-independent, on-disk representations of C IRs: one using XML, and the other using an Internet standard binary encoding called XDR. We benchmark the parsing speeds of both options, finding the XML to be about a factor of two slower than parsing C and the XDR over six times faster. Furthermore, we show that the XML files are far too large at 19 times the size of C source code, while XDR is only 2.2 times the C size. We also demonstrate the portability of our XDR system by presenting a C source code querying tool in Ruby. Our solution and the insights we gained from building it will be useful to analysis authors and other clients of C IRs. We have made our software freely available for download at http://www.cs.umd.edu/projects/PL/scil/.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Serializing C intermediate representations for efficient and portable parsing

C static analysis tools often use intermediate representations (IRs) that organize program data in a simple, well-structured manner. However, the C parsers that create IRs are slow, and because they are difficult to write, only a few implementations exist, limiting the languages in which a C static analysis can be written. To solve these problems, we investigate two language-independent, on-dis...

متن کامل

Serializing C intermediate representations for e cient and portable parsing ( preprint )

C static analysis tools often use intermediate representations (IRs) that organize program data in a simple, well-structured manner. However, the C parsers that create IRs are slow, and because they are di cult to write, only a few implementations exist, limiting the languages in which a C static analysis can be written. To solve these problems, we investigate two language-independent, on-disk ...

متن کامل

Serializing C Intermediate Representations to Promote Efficiency and Portability

C static analysis tools need access to intermediate representations (IRs) that organize program data in a well-structured manner. However, the C parsers that create IRs are slow, and they are not available for most languages. To solve these problems, we investigate two language-independent, on-disk representations of C IRs: one using XML, and the other using an Internet standard binary encoding...

متن کامل

The Incremental Design of Parallel Compiler Intermediate Representations using SPIRE

SPIRE is the first incremental methodology for designing the intermediate representations (IR) of compilers that target parallel programming languages. Its core philosophy is to extend in a systematic manner the IRs found in the compilation frameworks of sequential languages. Avoiding the often-used ad-hoc approach of encoding all parallel constructs as “fake” function calls, SPIRE enables the ...

متن کامل

Learning Representations for Text-level Discourse Parsing

In the proposed doctoral work we will design an end-to-end approach for the challenging NLP task of text-level discourse parsing. Instead of depending on mostly hand-engineered sparse features and independent components for each subtask, we propose a unified approach completely based on deep learning architectures. To train more expressive representations that capture communicative functions an...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2000